Skip to content

修复 Codex 图片生成 ACP 输出阻塞#295

Merged
TCP404 merged 3 commits into
iOfficeAI:mainfrom
Jassy930:fix/codex-image-acp-output
Jun 16, 2026
Merged

修复 Codex 图片生成 ACP 输出阻塞#295
TCP404 merged 3 commits into
iOfficeAI:mainfrom
Jassy930:fix/codex-image-acp-output

Conversation

@Jassy930

@Jassy930 Jassy930 commented May 19, 2026

Copy link
Copy Markdown
Contributor

变更说明

  • 在 ACP ToolCallUpdate 翻译边界清洗 Codex 图片生成的大型 inline base64,只保留 saved_path/image.path 等小型结构化字段。
  • 图片已落盘时把工具状态归一为 completed,并同步 raw_output.status,避免会话一直显示执行中。
  • 增加翻译层和 stream relay 回归测试,覆盖 base64 省略、状态完成、MIME 推断和持久化内容不再携带大字段。
  • 更新中文架构文档和实施计划文档。

关联 Issue

Closes #297

验证

  • cargo fmt --check
  • cargo test -p aionui-ai-agent codex_image_tool_update -- --test-threads=1
  • cargo test -p aionui-conversation run_acp_image_tool_call_update_persists_finish_without_base64
  • git diff --check

@lornestack lornestack left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix — and for the thorough plan doc. We reproduced the bug and confirmed your root-cause analysis: codex-acp >= 0.14.0 (zed-industries/codex-acp#251, shipped in v0.14.0, which our migration 008 pins) maps Codex's ImageGenerationEnd to an ACP ToolCallUpdate whose raw_output.result embeds the whole image as base64 (a multi-MB single JSON-RPC message), and Codex often reports status: "generating" on that final event, so the tool call never reaches completed. Sanitizing at the ACP translation boundary is the right place, and the field contract (result_omitted / image.path) lines up with the AionUi frontend PR (iOfficeAI/AionUi#2935). Approach LGTM overall.

Two correctness issues need to be addressed before merge:

1. Oversized base64 still passes through when saved_path is missing

In sanitize_inline_image_result:

let should_omit = obj
    .get("result")
    .and_then(|v| v.as_str())
    .map(|result| saved_path.is_some() && is_probably_inline_image_result(result))
    .unwrap_or(false);

saved_path.is_some() gates the stripping, so whenever Codex did not save the file (older codex versions, interrupted/failed generations), the multi-MB base64 still flows into WebSocket broadcast and SQLite — which is exactly the failure mode #297 describes. Suggestion: strip any oversized inline-image result unconditionally, and let saved_path only decide whether the image { path, mime_type, source } object is added.

Suggested regression test: a ToolCallUpdate whose raw_output has a >64KB iVBORw0KGgo... result but no saved_path — assert result is removed, result_omitted == true, no image object, and the status passes through unchanged.

2. failed status gets overridden to completed

normalize_tool_status returns Completed whenever the sanitized raw_output contains image.path, regardless of the SDK status — so a ToolCallStatus::Failed reported by the agent would be rewritten to completed. Suggestion: only normalize non-terminal statuses, e.g.:

match (image_saved, sdk_status.map(map_sdk_tool_status)) {
    (true, None | Some(AcpToolCallStatus::Pending | AcpToolCallStatus::InProgress)) => {
        Some(AcpToolCallStatus::Completed)
    }
    (_, status) => status,
}

Suggested regression test: status: Failed + saved_path present — assert the event keeps failed and raw_output.status is not rewritten.

Non-blocking: docs/plans/2026-05-19-codex-image-acp-output.md is a 776-line implementation plan. We'd prefer to keep working plans out of the repo — please drop it from the PR (the ARCHITECTURE.zh-CN.md section is enough documentation).

Happy to merge once the two issues above are covered with tests.

@lornestack

Copy link
Copy Markdown
Contributor

Thanks @Jassy930 — the latest commit looks great. I went through it and confirmed all the requested changes are addressed:

  • ✅ Oversized inline-image result is now stripped unconditionally, with saved_path only gating the image { path, mime_type, source } object.
  • normalize_tool_status preserves a terminal failed and only promotes non-terminal statuses (None/Pending/InProgress) to completed.
  • ✅ Both regression tests added (codex_image_tool_update_omits_base64_without_saved_path, codex_image_tool_update_preserves_failed_status) with precise assertions — they pass locally for me.
  • ✅ The implementation-plan doc was dropped.

One thing left before this can merge: the Test CI job fails to compile (not an assertion failure) because the branch is behind main. Since this PR was branched, StreamRelay::new gained a turn_id parameter (now 7 args):

pub fn new(conversation_id, msg_id, turn_id, user_id, repo, broadcaster, cron_service)

The new run_acp_image_tool_call_update_persists_finish_without_base64 test still calls it with the old 6-arg form, so CI reports:

error[E0061]: this function takes 7 arguments but 6 arguments were supplied
error[E0277]: `BroadcastEventBus: aionui_db::IConversationRepository` is not satisfied

Could you merge/rebase the latest main and add the turn_id argument to that StreamRelay::new(...) call (e.g. "turn-1".into() between msg_id and user_id)? The translate.rs / events changes themselves merge cleanly — this stale StreamRelay::new signature is the only drift. Once CI is green I will approve. Thanks again for the thorough work!

@Jassy930

Jassy930 commented Jun 15, 2026

Copy link
Copy Markdown
Contributor Author
image

Jassy930 added 3 commits June 15, 2026 20:36
Strip oversized inline-image base64 from ToolCallUpdate raw_output
unconditionally, regardless of whether saved_path is present. Older
codex versions and interrupted/failed generations emit multi-MB base64
without saved_path; that payload must not reach WebSocket or SQLite
(the failure mode in iOfficeAI#297). saved_path now only decides whether the
structured image{path,mime_type,source} object is attached.

Restrict status normalization to non-terminal statuses so a genuine
Failed reported by the agent is preserved instead of being rewritten
to completed.

Add regression tests:
- oversized base64 without saved_path is stripped, no image object,
  status passes through unchanged
- Failed + saved_path keeps failed and does not rewrite raw_output.status

Drop the 776-line working plan doc from the repo per review.
@Jassy930 Jassy930 force-pushed the fix/codex-image-acp-output branch from a603d37 to 38f94cd Compare June 15, 2026 12:37
@Jassy930

Copy link
Copy Markdown
Contributor Author

配套关系标记:这个后端 PR 负责保留/持久化 Codex ACP image tool output 里前端可展示的 image patch 数据;聊天框真正展示图片还需要前端 PR iOfficeAI/AionUi#2935 一起合入。单独合入后端时,数据侧会具备图片输出,但当前 UI 不会完成图片预览渲染。

@lornestack lornestack left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving — all requested changes are in and CI is green.

Re-verified at head 38f94cd:

  • Oversized inline-image result is stripped unconditionally; saved_path only gates the structured image { path, mime_type, source } object.
  • normalize_tool_status preserves a terminal failed and only promotes non-terminal statuses to completed.
  • The run_acp_image_tool_call_update_persists_finish_without_base64 test now matches the current 7-arg StreamRelay::new signature.
  • Full Test suite + Clippy + Format all pass.

Sanitizing at the ACP translation boundary is the right call, and the result_omitted / image.path contract lines up with the AionUi frontend PR (iOfficeAI/AionUi#2935). Thanks @Jassy930 for the careful follow-up — merging can proceed once the frontend PR is ready to go in together.

@TCP404

TCP404 commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

已补一个 PR,用来收紧 ACP image path/status 判断:#477

覆盖点:

  • saved_path 是图片路径且没有 inline result 时也补 image 并完成;
  • 非图片 saved_path 不再误判为图片输出;
  • 空 image path 不再触发 completed 归一化。

@TCP404 TCP404 merged commit 3f055d6 into iOfficeAI:main Jun 16, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

现在使用aionui调用codex时,如果让codex生成图片了,会话会卡住

3 participants